Methods of detecting long range chromosomal interactions

ABSTRACT

The present invention relates to a method of monitoring epigenetic changes comprising monitoring changes in conditional long range chromosomal interactions at at least one chromosomal locus where the spectrum of long range interaction is associated with a specific physiological condition, the method comprising the steps of: —(i) in vitro crosslinking of said long range chromosomal interactions present at the at least one chromosomal locus; (ii) isolating the cross linked DNA from said chromosomal locus; (iii) subjecting said cross linked DNA to restriction digestion with an enzyme that cuts at least once within the at least one chromosomal locus; (iv) ligating said cross linked cleaved DNA ends to form DNA loops; (v) identifying the presence of said DNA loops; wherein the presence of DNA loops indicates the presence of a specific long range chromosomal interaction.

FIELD OF INVENTION

The current invention relates to methods of monitoring epigenetic changes, diagnosing specific physiological conditions and the use of antisense RNA for the treatment of physiological conditions.

BACKGROUND

The eukaryotic genome is organized into complex higher order structures; in fact, early electron micrographs of extracted chromatin revealed a non-histone scaffold forming radial loops (Earnshaw and Laemmli, 1983, J Cell Biol 96, 84-93). Unrestrained negative superhelicity in mammalian genomes suggests chromosome “domain” sizes on the order of tens of kilobases (Kramer and Sinden, 1997, Biochem 36, 3151-3158). More recently, using 3C technology (capturing chromosome conformation) higher order long range interactions have been demonstrated to exist in a wide variety of eukaryotes including S. cerevisiae (Dekker et al, 2002, Science 295, 1306-1311), fly (Blanton et al, 2003, Genes Dev 17, 664-675), mouse (Tolhuis et al, 2002, Mol Cell 10, 1453-1465) and human (Carroll et al, 2005, Cell 122, 33-43) cells, generally kilobases in size. In mammalian cells, association of enhancer or locus control regions with actively expressed genes has been demonstrated in the β-globin (Tolhuis et al, 2002, Mol Cell 10, 1453-1465) and C-reactive protein (Choi et al, 2007, Nucleic Acids Res 35, 5511-5519) loci, as well as looping together of recombining immunoglobulin genes (Skok et al, 2007, Nat Immunol 8, 378-387). In mammalian cells, ongoing transcription is proposed to drive genome organization and gene looping (Chakalova et al, 2005, Nat Rev Genet 6, 669-677; Marenduzzo et al, 2007, Trends Genet 23, 126-133 and references therein) but recent analyses suggest that, at least for the β-globin locus, some long range DNA interactions are maintained after transcription is inhibited (Palstra et al, 2008, PLoS ONE 3, e1661), arguing against models suggesting that engaged RNA polymerase functions as ties of chromatin loops.

In the yeast genome, active transcription does appear to be important for the formation of “gene loops”, long range interactions that link the 5′ and 3′ regions of active genes (Ansari and Hampsey, 2005, Genes Dev 19, 2969-2978; O'Sullivan et al, 2004, Nat Genet 36, 1014-1018; Singh and Hampsey, 2007, Mol Cell 27, 806-816). Chromatin immunoprecipitation (ChIP) demonstrates the presence of TFIIB and the phosphorylated form of RNAPII on both promoters and terminators. Moreover, functional TFIIB is required to form these long range interactions (Singh and Hampsey, 2007, Mol Cell 27, 806-816). TFIIB is capable of interacting with non-coding RNA and loss of TFIIB promoted by the non-coding RNA leads to loss of long range interaction at DHFR (Martianov et al, 2007, Nature 445, 666-670).

High levels of non-coding RNA transcribed throughout the genome include transcripts antisense to open reading frames. They feature in both eukaryotic and prokaryotic genomes (Johnson et al, 2005, Trends Genet 21, 93-102; Kapranov et al, 2007, Nat Rev Genet 8, 413-423; Selinger et al, 2000, Proc Natl Acad Sci USA 103, 4192-4197). In eukaryotes, many of these transcripts are never destined for translation into protein and some are targeted for exosome-mediated degradation by the TRAMP complex (Bickel and Morris, 2006, Mol Cell 22, 309-316). In yeast these cryptic unstable transcripts (CUTs) are detected at promoters (Berretta et al, 2008, Genes Dev 22, 615-626; Davis and Ares, 2006, Proc Natl Acad Sci USA 103, 3262-3267), in intergenic regions (Wyers et al, 2005, Cell 121, 725-737) and antisense to genes (Camblong et al, 2007, Cell 131, 706-717; Uhler et al, 2007, Proc Natl Acad Sci USA 104, 8011-8016).

Detailed experiments profiling RNAs from S. cerevisiae using genomic tiling arrays have shown that transcription occurs in virtually all parts of the yeast genome (Perocchi et al, 2007, Nucleic Acids Res 35, e128; Samanta et al, 2006, Proc Natl Acad Sci USA 103, 4192-4197; Miura et al, 2006, Proc Natl Acad Sci USA 103, 17846-17851; Hongay et al, 2006, Cell 127, 735-745; David et al, 2006, Proc Natl Acad Sci USA 103, 5320-5325; Havilio et al, 2005, BMC Genomics 6, 93). These analyses also indicate that between 100 and 370 genes are transcribed at least partially in both directions, producing stable polyadenylated sense and antisense transcripts. Many of these genes are actively transcribed.

The current inventors have used the GAL locus as a model system in which to compare induced and repressed states and observe differences in antisense transcript and epigenetic regulation. The inventors show that there are antisense transcripts controlling both the induced and repressed states at the GAL locus, and that these transcripts differ in size, position and abundance. Highly abundant antisense transcripts at the induced locus are associated with the production of the sense transcript from the GAL10 promoter. Moreover, the levels of antisense transcripts strongly correlate with levels of Hda1 associated with the locus but not with histone acetylation itself. The inventors also identify that Hda1 appears to be required for long range interactions at the repressed GAL locus, suggesting a link between the antisense transcripts, epigenetic modifications and higher order chromatin structures in the repressed state. Changes in the conformation of the locus upon switching from the repressed to the induced state have been identified, with antisense RNA implicated in controlling this. The current invention is based on the discovery that gene repression is a proactive state of regulation which involves production of antisense transcripts and specific epigenetic changes at the locus.

SUMMARY

According to a first aspect of the invention there is provided a method of monitoring epigenetic changes comprising monitoring changes in conditional long range chromosomal interactions at at least one chromosomal locus where the spectrum of long range interaction is associated with a specific physiological condition, said method comprising the steps of:

(i) in vitro crosslinking of said long range chromosomal interactions present at the at least one chromosomal locus;
(ii) isolating the cross linked DNA from said chromosomal locus;
(iii) subjecting said cross linked DNA to restriction digestion with an enzyme that cuts at least once within the at least one chromosomal locus;
(iv) ligating said cross linked cleaved DNA ends to form DNA loops;
(v) identifying the presence of DNA loops;
wherein the presence of DNA loops indicates the presence of a specific long range chromosomal interaction.

It will be understood that conditional long range chromosomal interactions will always be present in chromatin. It will be further understood that these interactions are dynamic and will change depending on the status of the region of the chromosome, i.e. if it is being transcribed or repressed in response to change of the physiological conditions.

As used herein, the term conditional long range interactions refers to interactions between distal regions of a locus on a chromosome, said interactions being dynamic and altering depending upon the status of the region of the chromosome.

As used herein, the term spectrum of long range interaction refers to the different conformations of long range chromosomal interactions which may be present at a given chromosomal locus. It will be understood that as described above these interactions are dynamic, with various long range interactions forming or breaking depending on the status of the locus.

It will further be understood that the long range chromosomal interactions can be cross linked by any suitable means. In a preferred embodiment, the long range chromosomal interactions are crosslinked using formaldehyde.

It will be further understood that the DNA loops present may be indicative of transcription or repression of said chromosomal locus, or alternatively, expression of an altered product from said chromosomal locus.

The presence of the DNA loops can be identified as described herein below in relation to the GAL locus. It will be readily apparent to the skilled person that the method described in relation to this locus can be adapted to be used at any other locus where long range interactions are thought to occur. These loops can be detected using techniques known in the art such as the 3C (Capturing Chromosome Conformation) assay (Dekker, 2006, Nat Methods 3, 17-21; Dekker et al, 2002, Science 295, 1306-1311; O'Sullivan et al, 2004, Nat Genet 36, 1014-1018).

The skilled person will be aware of numerous restriction enzymes which can be used to cut the DNA within the chromosomal locus of interest. It will be apparent that the particular enzyme used will depend upon the locus studied and the sequence of the DNA located therein.
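By way of illustration only, the choice of an enzyme that cuts at least once within the locus can be checked in silico before any bench work. The sketch below assumes a linear locus sequence held as a plain string and uses the DpnII recognition sequence GATC (DpnII is the enzyme mapped in FIG. 6); the function names, the toy sequence and the default cut offset are hypothetical and are not part of the described method.

```python
def find_cut_sites(sequence: str, site: str = "GATC", cut_offset: int = 0) -> list[int]:
    """Return 0-based cut positions for every occurrence of `site`.

    `cut_offset` is the distance from the start of the recognition site to
    the cut on the top strand (0 for an enzyme cutting just before the site).
    """
    sequence = sequence.upper()
    site = site.upper()
    cuts = []
    start = sequence.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        # Step by one base so overlapping occurrences are not missed.
        start = sequence.find(site, start + 1)
    return cuts


def digest(sequence: str, site: str = "GATC", cut_offset: int = 0) -> list[str]:
    """Split a linear sequence into restriction fragments at each cut."""
    cuts = find_cut_sites(sequence, site, cut_offset)
    bounds = [0] + cuts + [len(sequence)]
    # Skip empty fragments (e.g. a cut at position 0).
    return [sequence[a:b] for a, b in zip(bounds, bounds[1:]) if b > a]


def cuts_within_locus(sequence: str, site: str = "GATC") -> bool:
    """True if the enzyme cuts at least once within the locus sequence."""
    return len(find_cut_sites(sequence, site)) >= 1


# Toy locus sequence, for illustration only.
locus = "AAAGATCCCCTTTGATCGG"
fragments = digest(locus)
```

Because the recognition sequence used here is palindromic, searching one strand is sufficient; an enzyme with a non-palindromic site would also need the reverse complement to be searched.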

The current invention is based on the surprising discovery by the inventors that conditional long range chromosomal interactions are always present at a given locus on the chromosome and that the profile of conditional long range chromosomal interactions changes depending on the actual status of the region, its activity and the physiological conditions, i.e. the presence or absence of a particular long range interaction will provide an indication of the status of that region.

Moreover, the inventors have discovered that, consistent with earlier genetic data, these conditional long range chromosomal interactions may overlap and include the regions of chromosomes shown to encode relevant or undescribed genes, but equally may be in intergenic regions. It should further be noted that the inventors have discovered that long range interactions in all regions are equally important in determining the status of the chromosomal locus. These long range interactions are not necessarily in the coding region of a particular gene located at the locus and may be in intergenic regions.

It will further be understood by the skilled person that the term epigenetic refers to heritable changes in gene function within a cell which are caused by changes other than changes to the underlying DNA sequence. These changes may be caused, for example, by environmental factors, DNA methylation, non-coding antisense RNA transcripts, non-mutagenic carcinogens, histone modifications, chromatin remodelling and specific local long range DNA interactions, all of which have been implicated in creating a specific environment for defined transcriptional activity on the genes or non-coding RNA transcriptional units within the region of interest.

It will be understood that the epigenetic changes may be caused by changes to the underlying nucleic acid sequence which themselves do not directly affect a gene product or the mode of gene expression. Such changes may be, for example, SNPs within and/or outside of the genes, and gene fusions and/or deletions of intergenic DNA.

It will further be apparent that the term specific physiological condition refers to any condition in which there is a change in the defined physiological status of the cell. This may be by a change in the level of expression of one or more genes, or a change in one or more gene products. Examples of such conditions include cancer (benign or malignant growth), cardiovascular disorders, inflammatory conditions, including autoimmune disorders and inflammatory responses to developing infectious diseases, inherited genetic disorders modulated by epigenetic mechanisms and neurodegenerative diseases.

Preferably, the presence of the DNA loops is identified using PCR techniques. It will be understood that the presence of a loop may be indicated by the presence of a PCR product which is absent in the absence of the DNA loop, or vice versa. It will also be understood that the size of the PCR product produced may be indicative of the specific DNA loop present and may therefore be used to identify the status of the locus.
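As a rough sketch of how product size can be tied to a specific loop, the expected size of a PCR product spanning a ligation junction can be approximated as the sum of the distances from each primer to the ligated restriction end of its own fragment. The class and function names, the size tolerance and the example distances below are hypothetical; the exact size in practice also depends on how the restriction site is reconstituted on ligation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Primer:
    name: str
    # Distance in bp from the primer's 5' end to the restriction end of
    # its fragment that takes part in the ligation (assumed known).
    distance_to_cut: int


def expected_product_size(first: Primer, second: Primer) -> int:
    """Approximate size of the product across the ligation junction."""
    return first.distance_to_cut + second.distance_to_cut


def loop_detected(observed_sizes: list[int], first: Primer, second: Primer,
                  tolerance: int = 10) -> bool:
    """True if any observed band lies within `tolerance` bp of the expected size."""
    expected = expected_product_size(first, second)
    return any(abs(size - expected) <= tolerance for size in observed_sizes)


# Hypothetical primers on two fragments brought together by a loop.
upstream = Primer("fragment_A", 120)
downstream = Primer("fragment_B", 95)
```

An empty list of observed bands yields `False`, which corresponds to the case in which the product, and hence the loop, is absent.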

In one preferred embodiment, the presence of a DNA loop indicates an altered transcription state indicative of a specific physiological condition.

In a second preferred embodiment, the absence of a DNA loop indicates an altered transcription state indicative of a specific physiological condition.

It will be apparent to the skilled person that the method according to the first aspect can be used not only to monitor the presence of a specific long range chromosomal interaction at a chromosomal locus, but equally to monitor the absence of a specific long range chromosomal interaction at said chromosomal locus.

Preferably, the physiological condition is selected from amongst cancer, cardiovascular disorders, inflammatory conditions, including autoimmune disorders and inflammatory responses to infectious diseases, and inherited genetic disorders modulated by epigenetic mechanisms. Any other condition which results in a change in at least one long range chromosomal interaction may also be identified by the current methods.

It will be understood that in any aspect of the present invention the changes in the conditional long range chromosomal interactions of a sample may be monitored by comparing the conformation of long range chromosomal interactions at a locus at different time points in the same tissue or cell type or by comparison to a sample corresponding to a known physiological state.
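The comparison described above can be pictured, purely as a sketch, as a set comparison between the interactions detected in a sample and those detected in a reference (an earlier time point or a known physiological state). Here each interaction is represented as an unordered pair of fragment labels; the labels and function names are hypothetical.

```python
def interaction(fragment_a: str, fragment_b: str) -> tuple[str, str]:
    """Represent an interaction as an order-independent pair of fragment labels."""
    first, second = sorted((fragment_a, fragment_b))
    return (first, second)


def compare_profiles(reference: set[tuple[str, str]],
                     sample: set[tuple[str, str]]) -> dict[str, list[tuple[str, str]]]:
    """Report interactions gained, lost and shared relative to the reference."""
    return {
        "gained": sorted(sample - reference),
        "lost": sorted(reference - sample),
        "shared": sorted(sample & reference),
    }


# Hypothetical fragment labels for one locus at two time points.
time_point_1 = {interaction("F1", "F4"), interaction("F2", "F6")}
time_point_2 = {interaction("F4", "F1"), interaction("F3", "F5")}
changes = compare_profiles(time_point_1, time_point_2)
```

Any non-empty "gained" or "lost" list indicates a change in the spectrum of long range interactions between the two samples.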

Furthermore, it should be understood that the long range chromosomal interactions of the present invention do not relate to long range interactions between genes and their regulatory elements such as previously described by Chambeyron et al., (2004), Curr Opin Cell Biol. 16, 256-262; de Laat et al., (2003), Chromosome Res. 11, 447-459; and Dekker, (2003), J. Trends Biochem. Sci. 28, 277-280. Rather, the present invention relates to conditional changes in the long range interactions within a particular locus as an indication of a switch in the activity of a gene.

According to a second aspect of the current invention there is provided a method of monitoring epigenetic changes comprising monitoring changes in conditional long range chromosomal interactions at at least one chromosomal locus where the spectrum of long range interaction is associated with a specific physiological condition, said method comprising the step of identifying a change in the antisense RNA profile expressed from the at least one chromosomal locus.

It will be apparent to the skilled person that the change in the antisense RNA profile may be a change in the size, start position, and/or number of antisense RNA transcripts.
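A minimal sketch of such a profile comparison is given below, assuming each antisense transcript is summarised by its mapped start position and its size. The data structure, field names and example values are hypothetical and serve only to show how changes in size, start position and number might be reported against a profile from a known state.

```python
from dataclasses import dataclass


@dataclass(frozen=True, order=True)
class Transcript:
    start: int  # mapped start position within the locus (bp)
    size: int   # transcript length (nt)


def profile_changes(known: list[Transcript],
                    observed: list[Transcript]) -> dict[str, object]:
    """Summarise differences between an observed profile and a known-state profile."""
    known_set, observed_set = set(known), set(observed)
    return {
        "count_change": len(observed_set) - len(known_set),
        "new_transcripts": sorted(observed_set - known_set),
        "missing_transcripts": sorted(known_set - observed_set),
    }


def profile_changed(known: list[Transcript], observed: list[Transcript]) -> bool:
    """True if the two profiles differ in size, start position or number."""
    return set(known) != set(observed)


# Hypothetical profiles for the same locus in two states.
repressed = [Transcript(start=500, size=2400)]
induced = [Transcript(start=500, size=2250), Transcript(start=1800, size=900)]
summary = profile_changes(repressed, induced)
```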

It will be understood by the skilled person that the phenomenon of the production of antisense RNA transcripts at repressed loci on chromosomes is known. However, the inventors have surprisingly discovered that the profile of antisense RNA transcribed from a chromosomal locus changes depending on whether the locus is induced or repressed and that the antisense RNA transcripts produced play a central role in controlling transcription of the sense RNA transcript from a particular locus and the epigenetic conditions at that locus, including the long range interactions.

According to a third aspect of the current invention there is provided a method of diagnosing a disorder associated with at least one epigenetic change in a subject, said method comprising identifying a change in one or more long range chromosomal interactions at at least one chromosomal locus associated with said disorder in a sample isolated from the subject; wherein said method comprises the method of either of aspects one or two.

Preferably, the epigenetic change results in altered transcription from the chromosomal locus.

It will be understood that the altered transcription can be up regulation, repression, or production of an alternative transcript with a changed start site and/or termination site, or a splice variant of such.

It will be apparent that the epigenetic change causes a change in the expression of at least one gene and/or transcriptional unit within the non-coding part of the genome.

Preferably, the disorder is selected from amongst cancer, cardiovascular disorders, inflammatory conditions, including autoimmune disorders and inflammatory responses to infectious diseases, and inherited genetic disorders modulated by epigenetic mechanisms. Any other condition which results in a change in at least one long range chromosomal interaction may also be diagnosed by the current methods.

According to a fourth aspect of the current invention there is provided a method of regulating transcription of at least one gene in a patient suffering from a disorder associated with altered gene expression, said method comprising administering to said patient an antisense RNA in an amount effective to alter transcription of said at least one gene.

In one embodiment, the disorder results from over expression of said at least one gene.

It will be apparent to the skilled person that the disorder can equally result from repression of said at least one gene, or from production of an altered gene product from said at least one gene.

Preferably, the disorder is selected from amongst cancer, cardiovascular disorders, inflammatory conditions, including autoimmune disorders and inflammatory responses to infectious diseases, and inherited genetic disorders modulated by epigenetic mechanisms.

In a preferred embodiment, said antisense RNA targets at least one CTCF binding site.

CTCF is a multifunctional factor, which as discussed below is implicated in establishing and maintaining high order chromatin structures.

In a further preferred embodiment, administration of said antisense RNA results in modulation of HDAC enzymes.

Histone acetylation is known to be involved with modulation of transcription. It has been suggested that this modulation is also controlled via antisense RNA.

HDAC enzymes are classified into four classes depending on sequence identity and domain organization. In a preferred embodiment, said HDAC enzyme is selected from a Class I-IV HDAC enzyme.

Class I HDAC enzymes include HDAC1, HDAC2, HDAC3, HDAC8;
Class II HDAC enzymes include HDAC4, HDAC5, HDAC6, HDAC7A, HDAC9, HDAC10;
Class III HDAC enzymes include homologs of Sir2 in the yeast Saccharomyces cerevisiae, and sirtuins in mammals (SIRT1, SIRT2, SIRT3, SIRT4, SIRT5, SIRT6, SIRT7);
Class IV HDAC enzymes include HDAC11.

According to a fifth aspect of the current invention there is provided a method of regulating transcription of at least one gene in a patient suffering from a disorder associated with altered gene expression, said method comprising administering to said patient interfering RNA complementary to an antisense RNA molecule implicated in modulation of said gene.

It will be apparent to the skilled person that the disorder can result from over expression or repression of said at least one gene, or from production of an altered gene product from said at least one gene.

Preferably, the disorder is selected from amongst cancer, cardiovascular disorders, inflammatory conditions, including autoimmune disorders and inflammatory responses to infectious diseases, and inherited genetic disorders modulated by epigenetic mechanisms.

According to a sixth aspect of the current invention, there is provided antisense RNA for the treatment of a disorder associated with altered gene expression, wherein said antisense RNA regulates transcription of said gene.

According to a seventh aspect there is provided the use of antisense RNA in the manufacture of a medicament for the treatment of a disorder associated with altered gene expression, wherein said antisense RNA regulates transcription of said gene.

Preferably, the disorder according to the fifth or sixth aspect is selected from amongst a cancer, cardiovascular disorders, inflammatory conditions, including autoimmune disorders and inflammatory responses to infectious diseases, and inherited genetic disorders modulated by epigenetic mechanisms.

In a first preferred embodiment, said RNA represses transcription of said gene.

In a second preferred embodiment, said RNA induces transcription of said gene.

In a preferred embodiment, said antisense RNA targets at least one CTCF binding site.

In a further preferred embodiment, said antisense RNA modulates HDAC enzymes.

It will be apparent to the skilled person that the above medicaments may be formulated into pharmaceutical dosage forms, together with suitable pharmaceutically acceptable carriers, such as diluents, fillers, salts, buffers, stabilizers, solubilisers, etc. The dosage form may contain other pharmaceutically acceptable excipients for modifying conditions such as pH, osmolarity, taste, viscosity, sterility, lipophilicity, solubility etc.

Suitable dosage forms include solid dosage forms, for example, tablets, capsules, powders, dispersible granules, cachets and suppositories, including sustained release and delayed release formulations. Powders and tablets will generally comprise from about 5% to about 70% active ingredient. Suitable solid carriers and excipients are generally known in the art and include, e.g. magnesium carbonate, magnesium stearate, talc, sugar, lactose, etc. Tablets, powders, cachets and capsules are all suitable dosage forms for oral administration.

Liquid dosage forms include solutions, suspensions and emulsions. Liquid form preparations may be administered by intravenous, intracerebral, intraperitoneal, parenteral or intramuscular injection or infusion. Sterile injectable formulations may comprise a sterile solution or suspension of the active agent in a non-toxic, pharmaceutically acceptable diluent or solvent. Suitable diluents and solvents include sterile water, Ringer's solution and isotonic sodium chloride solution, etc. Liquid dosage forms also include solutions or sprays for intranasal administration.

Aerosol preparations suitable for inhalation may include solutions and solids in powder form, which may be combined with a pharmaceutically acceptable carrier, such as an inert compressed gas.

Also encompassed are dosage forms for transdermal administration, including creams, lotions, aerosols and/or emulsions. These dosage forms may be included in transdermal patches of the matrix or reservoir type, which are generally known in the art.

Pharmaceutical preparations may be conveniently prepared in unit dosage form, according to standard procedures of pharmaceutical formulation. The quantity of active compound per unit dose may be varied according to the nature of the active compound and the intended dosage regime.

The active agents are to be administered to human subjects in “therapeutically effective amounts”, which is taken to mean a dosage sufficient to provide a medically desirable result in the patient. The exact dosage and frequency of administration of a therapeutically effective amount of active agent will vary, depending on such factors as the nature of the active substance, the dosage form and route of administration.

According to an eighth aspect of the current invention there is provided a method of identifying the transcription status of a chromosomal locus, said method comprising the steps of: identifying the antisense RNA transcript profile expressed from said chromosomal locus; and comparing said profile with the antisense RNA transcript profile of said chromosomal locus in a known state.

Preferably, said chromosomal locus comprises at least one gene.

Preferably, said gene is a gene known or suspected of being involved in a specific physiological condition.

It will be understood that said condition can be any condition associated with an epigenetic change, for example, cancer, cardiovascular disorders, inflammatory conditions, including autoimmune disorders and inflammatory responses to infectious diseases, and inherited genetic disorders modulated by epigenetic mechanisms.

Preferably, said method is performed in vitro.

For the avoidance of doubt, it is stated that features described in relation to one aspect of the invention are equally applicable to all other aspects of the invention. Furthermore, where a number of features are indicated as options, each individual feature is contemplated as being applicable individually or in combination with any other feature described in the application.

The invention will now be further described with reference to the following examples and figures in which:

FIG. 1. Conditional antisense transcripts at GAL10

Northern blots of total RNA probed with sense and antisense specific probes at GAL10. A, B. Strain BY4741 was cultured in galactose (lane 1), washed (lane 2) and transferred to medium containing glucose for 15 minutes (lane 3), 60 minutes (lane 4), 120 minutes (lane 5), 180 minutes (lane 6) or 360 minutes (lane 7). Two exposures of the antisense signal in B are shown. The position of the rRNA bands is indicated. The left panel in B and panel A were exposed for the same time. At 15 minutes the 2.25 kb and the 2.4 kb GAL10 AS (see FIG. 3B) are both detectable as the switch from the active to the repressed state occurs. C. Yeast cultured overnight in galactose were washed and transferred to fresh medium containing the sugar indicated. In this experiment, high levels of the GAL10 and GAL10-7 fusion transcript are evident in galactose with the equivalent antisense transcripts. The sense exposure is 20% of the antisense. Twice as much RNA is loaded in lane 2*.

FIG. 2. Reverse transcription (RT)-PCR mapping transcripts at the GAL locus

A Position of RT primers for reverse transcription (RT) of the sense (S) or antisense (AS) transcript and nested primers for PCR amplification. Each set of primers is designed to amplify a region about 200-300 bp at the sites shown at the GAL locus. B Mapping sense and antisense transcripts in total RNA in three carbon sources across GAL7 and GAL10. The control lacking RTase for the RNA prepared in glucose is shown in lane 7. C Mapping antisense transcripts at GAL7 and GAL10 in total RNA prepared from cells grown in glucose. Controls include omitting the RT step and a positive control for PCR primer efficiency on total DNA (not shown).

FIG. 3. Mapping transcripts at the GAL locus with strand-specific probes.

A Autoradiographs of total RNA prepared from BY4741 (lane 1), W303-1a (lane 2) and YMH147 (lane 3) cultured in glucose, raffinose or galactose and hybridized to single strand specific probes 1-7 (all designed to detect antisense transcripts with respect to GAL10). The positions of the rRNAs (RDN25 and RDN18) and tRNA are marked with black lines across the autoradiographs, which were exposed for 24 hours. B Summary of data in FIG. 2 and FIG. 3A showing transcripts at repressed (glucose) or induced (galactose) locus.

FIG. 4. Sequences at the 3′ end of GAL10 are required for induction of the sense transcript. A Schematic showing the position of the major transcripts at repressed or induced GAL10 (WT) and a derivative of GAL10 containing a pTEF:KanMX:TEFter (Mut) insertion creating a deletion of 554 bp at the 3′ end of GAL10. B and C Northern blots of total RNA from the strains shown cultured in the carbon sources indicated. In B the position of the rRNAs is indicated with dotted lines and sense and antisense specific probes were used. The autoradiographs were exposed for equivalent times. In C cultures were induced and samples prepared at 10 minute intervals after transfer from raffinose (lane 1) to galactose (lanes 2-6).

FIG. 5. Hda1 and Eaf3 association with GAL10 is related to levels of antisense transcript. Immunoprecipitation of chromatin prepared from cells cultured overnight in galactose and transferred to fresh medium containing galactose or glucose for 20 minutes and detected using real time PCR at GAL10. A Hda1-myc normalized to the untagged control. B Eaf3 normalized to the untagged control. C H3K18ac normalized to H3K18R and then histone H3.

FIG. 6. 3C with immunoprecipitation at the repressed GAL locus. A Map of DpnII restriction sites (vertical lines) showing the number, position and orientation of primers (arrows 3′ OH) and the approximate position of the induced sense transcripts and the antisense transcript at the repressed locus. B 3C with IP (Rbp1 top three panels, Myc bottom panel) in the strains shown (BY4741 background) at the sites indicated. C Controls for the standard reaction, including no formaldehyde crosslinking (-Form), no digestion (-dig), no DNA ligase, no ATP (-lig). On the gel is a 3C product (-IP step) and the standard 3C with immunoprecipitation (Expt). PCR products for the long range interaction at GAL10 are shown. All other interactions showed similar dependencies (not shown). D 3C with IP (Rbp1) in the strains shown (BY4741 background) at the sites indicated.

FIG. 7. Conditional long range interactions at the GAL locus. A Map of DpnII restriction sites (vertical lines) showing the number, position and orientation of primers (arrows 3′ OH) and the approximate position of the induced sense transcripts and the antisense transcript at the repressed locus. B 3C IP (Rbp1) products at the sites indicated from chromatin isolated from BY4741 cells grown in the carbon sources indicated. C 3C IP (Rbp1) over GAL10-7 and FMP27 in the WT strain (BY4741 background) grown in three different carbon sources. D A model for dynamic long range interactions at the repressed locus. The antisense transcripts from GAL7 (shown), GAL10 and other non-coding transcripts at the intergenic regions, together with the Hda1 and Eaf3 lysine deacetylases are proposed to create an environment suitable for long range interactions. In glucose, interactions across GAL10, GAL7 and GAL10-7 are detectable and a model to accommodate this is shown. Given their association with RNAPII and TFIIB, the long range interactions appear to represent a poised but repressed state with no active transcription at the locus. E A model for conditional long range interactions at the GAL locus. As at the repressed locus, long range interactions appear to represent a poised non-active state over the genes. An increase in dynamic switching involving long range interactions at GAL10 and GAL7 when glucose repression is removed (i.e. in raffinose) would lead to loss of the long range interaction over GAL10-7. The onset of active transcription would contribute to dynamic switching between expressed and repressed states. The production of the antisense transcripts at the induced locus is envisaged to play two roles: to facilitate production of the sense transcript and through Eaf3 and Hda1 to prime the region for repression and the formation of long range interactions when the gene is not actively expressed.

FIG. 8. In vivo CTCF-dependent transcriptional system. a, The integrated Luciferase reporter gene contains the wild type 90 bp CTCF-binding site N-Myc (pN-MycLuc wt) or a mutant deficient for CTCF binding (pN-MycLuc mut), in place of the promoter¹³. b, The enlarged map of the integrated construct with the positions of the primers indicated. Primers for ChIP assays at the N-Myc sites are shown in red and described previously¹³. Primers at the 5′ ends are depicted in brown, at the 3′ ends in blue; primers used in 3C and 4C assays are indicated in green. These primers are described under Materials and Methods in the on-line Supplement. The relevant TaqI restriction sites are shown as pale blue triangles; their positions are marked in relation to the Luciferase transcription start site. c, Presence of the wild type CTCF binding site is sufficient to drive the expression of Luciferase in pN-MycLuc wt; treatment with alpha-amanitin dramatically reduces the activity of Luciferase. d, CTCF and Pol II are bound to the N-Myc site in pN-MycLuc wt, but not in pN-MycLuc mut, as shown by ChIP assay with the indicated antibodies. e, Treatment with alpha-amanitin leads to dissociation of Pol II, but not CTCF from the N-Myc site, as shown by ChIP assay with the indicated antibodies. In panels “d” and “e”, primers used for the analysis are shown in red in FIG. 8 b and described earlier¹³.

FIG. 9. Chromatin Conformation Capture (3C) and ChIP assays. a, The 5′ and 3′ ends (green arrows in FIG. 8 b) are juxtaposed in pN-MycLuc wt (Lane 3), but not in pN-MycLuc mut (Lane 4), as revealed by the 3C assay. Lane 1 is the positive control for the ligated template, Lane 2 is the negative ligation control. Primers used for the analysis are described under Materials and Methods in the on-line Supplement. b, The 5′ (Lane 3) and 3′ (Lane 5) sites of pN-MycLuc wt, but not pN-MycLuc mut (Lanes 4 and 6), are occupied by Pol II, as revealed by the ChIP assay with the anti-Pol II antibody. Lane 1, input; Lane 2, preimmune serum. c, The 5′ (Lane 3) and 3′ (Lane 5) sites of pN-MycLuc wt, but not pN-MycLuc mut (Lanes 4 and 6), are occupied by CTCF. Lane 1, input; Lane 2, preimmune serum. In panels “b” and “c” the 5′-end specific primers are shown in brown and the 3′-end specific primers are shown in blue.

FIG. 10. The 4C assay (combination of ChIP and 3C assay) with the anti-Pol II and CTCF antibodies. Primers used for the analysis are shown in green in FIG. 8 b and described under Materials and Methods in the on-line Supplement. a, Pol II is present at the juxtaposition of the 5′ and 3′ sites of the pN-MycLuc wt (Lane 3), but not of the pN-MycLuc mut (Lane 4). Treatment with alpha-amanitin leads to the disappearance of Pol II from the high order chromatin structures in pN-MycLuc wt (Lane 5). b, CTCF is present at the juxtaposition of the 5′ and 3′ sites of the pN-MycLuc wt (Lane 3), but not of the pN-MycLuc mut (Lane 4). CTCF is still associated with the high order chromatin structure in pN-MycLuc wt following treatment with alpha-amanitin (Lane 5).

FIG. 11. A model showing how CTCF can link transcription with the high-order chromatin structures. a, CTCF binds to two sites in the 5′ and 3′ ends of the pN-MycLuc wt. b, The high order structure is established between the 5′ and 3′ ends of the pN-MycLuc wt via CTCF. c, Pol II binds to CTCF and initiates transcription. d, Following the inhibition of transcription and removal of Pol II, the remaining structure can still be detected due to its association with CTCF.

EXAMPLE 1

Epigenetic Control of the GAL Locus

Conditional Antisense Transcripts at GAL10

Addition of glucose to a culture of cells growing in galactose results in rapid inhibition of transcription. As expected, within 15 minutes of addition of glucose, levels of both the 2.25 kb GAL10 transcript and the longer 4.1 kb GAL10-7 fusion transcript (Greger and Proudfoot, 1998, Embo J 17, 4771-4779; St. John and Davis, 1981, J Mol Biol 152, 285-315) drop dramatically (FIG. 1A). An antisense transcript has been reported at the repressed GAL10 gene (Perocchi et al, 2007, Nucleic Acids Res 35, e128; Samanta et al, 2006, Proc Natl Acad Sci USA 103, 4192-4197; David et al, 2006, Proc Natl Acad Sci USA 103, 5320-5325; Miura et al, 2006, PNAS 103, 17846-17851). The inventors have investigated whether antisense transcripts are a general feature of the GAL10 gene and, if so, when these transcripts appear.

In induced cultures, three antisense transcripts were observed (FIG. 1B), two of which are similar in size to the 2.25 kb and the 4.1 kb sense transcripts at the induced GAL locus (FIG. 1A). The third detectable antisense transcript is smaller, about 1.5 kb. The autoradiographs in FIGS. 1A and 1B were hybridized to probes of similar specific activity and were exposed for similar lengths of time, suggesting that at the active GAL10 gene antisense transcripts make up a significant proportion of the total RNA. Moreover, even the endogenous 4.1 kb GAL10-7 transcript, which extends over both genes (FIG. 1A, C), is matched by an equivalent antisense transcript (FIG. 1B, C).

During glucose repression, the abundance of the GAL10 antisense transcripts drops considerably (FIG. 1B). As a result, two exposures of the northern blot in FIG. 1B are shown for clarity. The data in FIG. 1 demonstrate that the size of the antisense transcripts at the repressed locus also changes. Kinetic studies indicate that 15 minutes after addition of glucose to the culture, a switch is observed from the 4.1 kb, 2.25 kb and 1.5 kb antisense transcripts to three transcripts of about 3.5 kb, 2.4 kb and 1.5 kb. Both the 2.25 kb and the 2.4 kb transcripts can be detected 15 minutes after glucose addition. Thus the abundance and size of the antisense transcripts at GAL10 reflect whether the gene is induced or repressed, raising the interesting possibility that the antisense transcripts might play regulatory roles at the GAL locus in both conditions.

Mapping the Transcripts Around GAL10 Using RT-PCR.

Reverse transcription (RT) with strand specific primers coupled to PCR was used to determine the position of the sense and antisense transcripts around GAL10 in cells cultured in glucose, raffinose and galactose (FIG. 2). The strategy used here employed RT primers specific for detection of either sense (S) or antisense (AS) transcripts and nested PCR primers to yield products of approximately 200 bp (FIG. 2A). First the 2.25 kb GAL10 sense transcript and its equivalent antisense were analysed. These are relatively abundant transcripts amenable to mapping by RT-PCR (FIG. 2B). The GAL10 sense transcript can be detected with primer sets 6S to 8S, located across the GAL10 coding region, only in samples prepared from cells cultured in galactose (FIG. 2B, lane 2). A signal from primer sets 6AS to 8AS, designed to reverse transcribe and amplify antisense transcripts, is also evident only when the cells are cultured in galactose (FIG. 2B, lane 5). This suggests that the sense and antisense transcripts at induced GAL10 extend over the same regions.
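The strand logic behind strand specific RT primers can be sketched as follows. This is only an illustration, not the primer design used in the study: the function names and the toy sequence are hypothetical, and "top strand" is assumed here to mean the DNA strand that has the same sequence as the sense transcript. A primer that reverse transcribes the sense RNA must be complementary to it, whereas a primer that reverse transcribes the antisense RNA has the same sequence as the top strand.

```python
# Illustrative sketch only; names and the toy sequence are hypothetical.
# "Top strand" = the DNA strand with the same sequence as the sense transcript.

_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(_COMPLEMENT)[::-1]


def rt_primer(top_strand_region: str, target: str) -> str:
    """Return an RT primer (5' to 3') that anneals to the chosen transcript.

    target="sense":     primer is complementary to the sense RNA, i.e. the
                        reverse complement of the top-strand region.
    target="antisense": primer is complementary to the antisense RNA, i.e.
                        identical in sequence to the top-strand region.
    """
    if target == "sense":
        return reverse_complement(top_strand_region)
    if target == "antisense":
        return top_strand_region
    raise ValueError("target must be 'sense' or 'antisense'")


if __name__ == "__main__":
    region = "ATGGCC"  # toy top-strand sequence, not a real GAL10 sequence
    print("sense RT primer:    ", rt_primer(region, "sense"))
    print("antisense RT primer:", rt_primer(region, "antisense"))
```

Because each RT primer anneals to only one of the two transcripts, a PCR signal obtained after RT with that primer reports on a single strand, provided the no-RT control is clean.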

In glucose and raffinose, however, signals were detected with the antisense specific primer sets 7AS and 8AS but not with 6AS or any of the equivalent primers for the sense transcript. This is in agreement with the global microarray mapping, which shows the transcript arising within the GAL10 coding region about 500 bp from the 3′ end in repressing conditions (Perocchi et al, 2007, Nucleic Acids Res 35, e128; David et al, 2006, Proc Natl Acad Sci USA 103, 5320-5325). This suggests there are two distinct antisense transcripts over GAL10, corresponding to the induced and repressed states. If the 2.4 kb antisense transcript at the repressed locus starts within the GAL10 coding region, it is likely to extend into the GAL10-1 intergenic region; this was confirmed with signals from primer sets 9 and 10 (FIG. 2C).

Signals for primer sets 5S and 5AS, located over the GAL10-7 intergenic region, are much weaker, suggesting that the majority of the sense and antisense transcripts terminate or initiate, respectively, within this region. It was noted that sense and antisense specific primer sets for the GAL7 region also revealed evidence for sense and antisense transcripts (FIG. 2C). In this case, primer sets 3S and 3AS, spanning the end of the ORF and the 3′ region, were able to detect the sense transcript but not the antisense, suggesting that the antisense transcript is promoted further upstream. The data indicate that the transcripts might be offset with respect to one another. It is concluded that GAL10 and GAL7 produce both sense and antisense transcripts when cells are induced with galactose and that the transcripts are similar in size.

Mapping the Antisense Transcripts Around GAL10 Using Northern Blots.

Northern blots with strand specific probes were used to identify the approximate position of the antisense transcripts around GAL10 in cells from three different strain backgrounds cultured in glucose, raffinose and galactose (FIG. 3). The northern blots have the advantage of being able to correlate the size of a transcript with its position at the GAL locus. Probes 2 and 3 routinely showed poor hybridization with high background signals, making the hybridization to long transcripts hard to discern.

In induced cultures high levels of antisense RNA to GAL7 and GAL10 are evident (probes 2 and 4), confirming the RT-PCR data in FIG. 2. The 2.25 kb antisense at induced GAL10 shows slight hybridization with probe 3 but none with probe 5, suggesting that it is confined mainly to the GAL10 ORF region. Probe 6 hybridizes to the GAL1 sense transcript as GAL1 is on the Watson strand. Two other transcripts are only produced in induced cells. These both arise within GAL1 and extend on the sense strand through the GAL1 terminator and into FUR4. One transcript is larger than about 5 kb and the second is much larger.

In repressed cultures, the 2.4 kb (GAL10) antisense transcript showed strong hybridization to a short probe in the GAL10-1 intergenic region and hybridized less strongly to a GAL1 probe. No hybridization to these probes is seen with RNA prepared from cells cultured in galactose, consistent with the PCR mapping in FIG. 2. This suggests the GAL10 antisense transcript at the repressed locus arises about 500 bp from the 3′ end of the GAL10 ORF and extends over the GAL10-1 intergenic region. In a similar way, the longer 3.5 kb antisense GAL10 transcript also shows strong hybridization to these probes, suggesting it also extends across the GAL10-1 intergenic region and into GAL1 (FIG. 3A).

Four small non-coding transcripts are evident in cells cultured in both repressing and inducing conditions (FIG. 3A). Notably, these transcripts occur predominantly over the intergenic regions at the locus. The first is a 1 kb transcript detectable at the GAL7-10 intergenic region with a probe in the antisense orientation with respect to the GAL10 and GAL7 transcripts (Probe 3). The second is a transcript of approximately 600 bp that hybridizes with a sense orientation probe to the GAL1 terminator (Probe 7). The third transcript is about 1.7 kb and extends antisense from the 5′ region of KAP104 and through the GAL7 terminator region (Probe 1). The fourth is about 1.5 kb, initiates within GAL10 and extends into the GAL10-1 intergenic region (Probe 5).

In this analysis, RNA was prepared from three different strain backgrounds. The BY4741 (lane 1) and W303-1a (lane 2) strains produced similar profiles in all three carbon sources. In the YMH147 strain (lane 3), however, expression of the GAL locus appears to be derepressed in raffinose. At some regions, the transcript profile in this strain is different to that in glucose or galactose (compare for example lanes 1 and 3 in the three carbon sources when hybridized with Probes 3, 6 and 7).

A summary of the mapping data (from FIGS. 1-3) is shown in FIG. 3B. There are three notable points. The first is the relatively high level of antisense transcripts for each of the sense transcripts when cells are induced with galactose (GAL10, GAL7 and GAL10-7). Sense and antisense transcripts at the induced locus appear to be paired and synergistic. The second is the change in the size of the antisense transcripts on repression (in glucose), coupled to changes in the predicted initiation and termination sites at GAL10. The third is the presence of transcripts over the intergenic regions (GAL7 terminator; GAL10-7 intergenic region; GAL10-1 intergenic region; GAL1 terminator) at both the repressed and induced locus.

Sequences at the 3′ Region of GAL10 are Required for Induction of the Sense Transcript.

The data suggest that the position of the antisense transcripts over GAL10 reflects whether the gene is repressed or induced. These transcripts are likely to be initiated at different sites (FIG. 4A). The inventors designed an experiment to identify sequences required for induced antisense transcription (likely to arise at or near the 3′ flanking sequences of GAL10) by separating the 3′ flanking region from the main gene. This construct allowed testing for a putative promoter and examination of its effect on induction of the sense transcript. 554 bp between bases 2453 and 3007 were deleted, leaving 92 bp at the 3′ end of the GAL10 ORF and the 3′ flanking sequences intact. The deleted region was replaced by an expression cassette in the same direction as GAL10 expression. RNA was prepared from cells cultured in raffinose and transferred to galactose for 15 and 60 minutes, sufficient for strong induction of the sense transcript and the appearance of the induced 2.3 kb antisense transcript in the wild-type (FIG. 4B, lanes 5 and 7). The insertion at the 3′ end of GAL10 is sufficient to compromise expression of both the 2.4 kb and the 3.5 kb GAL10 antisense transcripts (FIG. 4B, lanes 6 and 8). Importantly, very low levels of sense transcript were observed in these samples (FIG. 4B, lanes 6 and 8). Thus, despite the promoter region for sense transcription being intact, production of the sense RNA is compromised in the strain with the insertion at the 3′ end of GAL10. This raises the possibility that the antisense transcript is required for transcription from the GAL10 promoter. The induction profiles at GAL7 and GAL1 (FIG. 4C) were also examined in the mutant strain. No obvious effect on the repressed levels at the GAL cluster was observed. In addition, there was little effect of this insertion on induction of transcripts at GAL7 or GAL1.

Thus sequences at the 3′ end of GAL10 are required for the production of the induced antisense transcript, which is in turn implicated in efficient transcription of the induced sense transcript. This suggests coordination of events between the 5′ and 3′ ends of the induced gene.

Sequences at the 3′ Region of GAL10 are not Required for Antisense Transcription at the Repressed Locus.

At the repressed locus, the antisense transcripts are likely to initiate at a different position within the GAL10 ORF (FIG. 4A). The insertion at the 3′ end of GAL10 made it possible to ask whether similar sequences are required for antisense transcription at the repressed locus as at the induced locus (FIG. 4B). Surprisingly, only a small reduction in the size of the two antisense transcripts in the mutant compared to the WT was noted. The levels and the relative size difference between the long and short antisense remained very similar to those in the WT. This suggests that sequences at which these transcripts normally initiate have been deleted or disrupted by the 3′ end insertion and that both transcripts are initiating at a new site, slightly closer to the 5′ region of GAL10. This raises the question of how these transcripts are promoted, given that their normal initiation site has been removed. One explanation is that they are promoted by sequences in the inserted expression cassette. However, as these new transcripts are not present in the induced conditions, it is unlikely that they are promoted from sequences within the inserted cassette. Thus there are different sequence requirements for the induced antisense transcript and the antisense at the repressed locus.

Hda1 Recruitment to GAL10 Reflects Levels of Antisense Transcript but not H3K18Ac.

As there is significantly more antisense transcript produced when GAL10 is in the induced compared to the repressed state, experiments were undertaken to ascertain whether Hda1 association at GAL10 reflects the level of antisense RNA or whether expression of the gene is induced or repressed. Hda1-myc association with GAL10 was assessed by ChIP using chromatin prepared from induced or repressed cells (FIG. 5A). Eaf3, a component of both the Rpd3S lysine deacetylase and the NuA4 lysine acetyltransferase (Allard et al, 1999, EMBO J 18, 5108-5119), was used as a control. Strains lacking Eaf3 show high levels of histone acetylation at the promoter and within the coding region of genes instead of the normal profile, in which levels are high at the promoter and drop in the coding region, suggesting a major effect of loss of Eaf3 on the Rpd3S complex in particular (Reid et al, 2004, Mol Cell Biol 24, 757-764).

Hda1-myc is associated with both the 5′ and the 3′ region of GAL10 in induced cells. The signal for Hda1-myc drops about 5 fold on the repressed chromatin. This difference is not due to differences in the levels of Hda1-myc in the cells cultured in repressing or inducing conditions (data not shown). These data are consistent with the presence of high levels of antisense transcript in induced cells and high levels of Hda1 across GAL10. Eaf3 shows a similar but less pronounced trend, with lower levels of association at both the promoter and the 3′ end of GAL10 on repression (FIG. 5B).

To identify whether there is a relationship between Hda1 association and histone acetylation at repressed and induced GAL10, H3K18ac, a known substrate for Hda1, was examined. Levels of H3K18ac at the repressed gene are low, similar to levels in an H3K18R strain, consistent with active deacetylation by Hda1. In induced cells levels of H3K18ac are significantly higher than in repressed cells, despite high levels of Hda1 in the induced strain (FIG. 5C). This difference is present even when the nucleosome loss that accompanies active transcription at GAL10 is taken into account and the H3K18ac signal is normalized to H3 levels. This suggests no direct correlation between Hda1 association and H3K18ac at GAL10. If Hda1 activity is related to antisense transcripts as proposed by Camblong et al, 2007, Cell 131, 706-717, then the high levels of H3K18ac on the induced gene, particularly at the 5′ region, can be explained by the histone acetylation that accompanies transcriptional activation and elongation of the sense transcript. The balance of acetylation and deacetylation would be shifted to result in a net gain of H3K18ac on the induced gene.

Long-Range Chromatin Interactions at the GAL Locus.

Long-range chromatin interactions, also known as gene loops, have been described at a limited number of active yeast genes (Ansari and Hampsey, 2005, Genes Dev 19, 2969-2978; O'Sullivan et al, 2004, Nat Genet 36, 1014-1018; Singh and Hampsey, 2007, Mol Cell 27, 806-816). These interactions represent juxtaposition of the 5′ and 3′ regions of yeast genes and are associated with RNAPII, TFIIB and the CPF transcription termination machinery. Given the data showing active antisense transcription at the repressed GAL locus and 3′ to 5′ end communication at the induced GAL locus, an investigation was undertaken to ascertain whether long-range interactions are implicated in antisense regulation of GAL expression. The GAL locus was analysed for the presence of long range interactions and, by including an immunoprecipitation step, for whether these interactions are associated with RNAPII and TFIIB.

Interactions between the GAL10 5′ and 3′ regions, the GAL7 5′ and 3′ regions and the 5′ region of GAL10 with the 3′ region of GAL7 from cells cultured in glucose, raffinose and galactose were monitored using a modified 3C (capturing chromosome conformation) technique (Dekker, 2006, Nat Methods 3, 17-21; Dekker et al, 2002, Science 295, 1306-1311; O'Sullivan et al, 2004, Nat Genet 36, 1014-1018) (FIG. 6A). As a control, the previously detected long range interaction at FMP27, which is associated with RNAPII and is present in all three growth conditions, was used (see FIG. 7C).

Long Range Interactions at the Repressed GAL Locus.

Long range interactions are detectable over GAL10, GAL7 and between the 5′ region of GAL10 and the 3′ region of GAL7 (GAL10-7) in cells cultured in glucose (FIG. 6B). Moreover, these interactions are associated with RNAPII, as the 3C interactions can be immunoprecipitated with antibodies to Rbp1. Controls for the GAL10 and the GAL10-7 interactions show that the PCR products are specific and depend on formaldehyde crosslinking, digestion of the chromatin with DpnII, the ligation reaction and addition of template to the PCR reaction (FIG. 6C). In some reactions more than one PCR product was observed. These PCR products were isolated from gels and the sequences over the DpnII junctions were analyzed. This revealed, for example, that the DpnII site abutting primer set 6 at the GAL10-1 intergenic region is partially protected in the cross-linked chromatin preparation, resulting in a product of about 550 bp in addition to the 260 bp product expected for the long range interaction over GAL10. Similar analyses at the three long range interaction sites confirmed that each of the PCR products observed is dependent on formaldehyde crosslinking and is consistent with a valid long range interaction showing juxtaposition of distant sequences.
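The dependence of the 3C products on DpnII digestion can be illustrated with a small in silico digest. The sketch below is not part of the described method; the function names and the toy sequence are hypothetical. It relies only on the fact that DpnII recognizes GATC and cuts immediately 5′ of the G. A site that is protected from digestion leaves two neighbouring fragments joined, which is one way a larger than expected PCR product can arise.

```python
# Illustrative sketch only; function names and the toy sequence are hypothetical.
# DpnII recognizes GATC and cuts immediately 5' of the G.

DPNII_SITE = "GATC"


def dpnii_sites(seq: str) -> list[int]:
    """Return the 0-based start position of every GATC in the sequence."""
    seq = seq.upper()
    sites = []
    pos = seq.find(DPNII_SITE)
    while pos != -1:
        sites.append(pos)
        pos = seq.find(DPNII_SITE, pos + 1)
    return sites


def dpnii_fragments(seq: str, protected: frozenset[int] = frozenset()) -> list[str]:
    """Cut 5' of each GATC, skipping any site positions listed in `protected`."""
    seq = seq.upper()
    cuts = [p for p in dpnii_sites(seq) if p not in protected]
    bounds = [0] + cuts + [len(seq)]
    # Drop the empty fragment that would appear if a site starts at position 0.
    return [seq[a:b] for a, b in zip(bounds, bounds[1:]) if b > a]


if __name__ == "__main__":
    toy = "AAGATCTTTTGATCCC"  # toy sequence, not taken from the GAL locus
    print(dpnii_sites(toy))
    print(dpnii_fragments(toy))
    print(dpnii_fragments(toy, protected=frozenset({10})))
```

Passing a site position in `protected` mimics partial protection of that site in cross-linked chromatin: the two flanking fragments stay joined, so any product spanning them is correspondingly longer.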

The limited number of long range interactions described to date have been observed on actively transcribed genes. The interactions described here occur at a locus that is repressed for GAL expression, although the presence of antisense transcripts suggests the locus is transcriptionally active. The next step was to examine whether the mechanism driving loop formation at repressed loci is similar to that at active genes. Long range interactions are reduced in a strain carrying the sua7-1 allele, expressing a version of TFIIB with an E62K (glutamic acid 62 to lysine) substitution (Singh and Hampsey, 2007, Mol Cell 27, 806-816). This mutant is defective in interactions at the 3′ region but not at the 5′ region of active genes. In repressed conditions, the GAL10, GAL7, GAL10-7 and the control FMP27 long range interactions all showed dependence on functional TFIIB, suggesting that loops on repressed loci have the same requirements as those on active genes (FIG. 6D). In addition, the GAL10 and the GAL7 long range interactions, but not that at GAL10-7, can be enriched in immunoprecipitates from a strain expressing TFIIB-myc compared to an untagged strain (FIG. 6B). This suggests that TFIIB-myc is closely associated with the chromatin regions that show long range interactions at GAL10 and GAL7. The reason for the inability to detect TFIIB associated with the long range interaction across GAL10 and GAL7 is not clear, as functional TFIIB is required for the interaction. One possibility is that the epitope tag is not accessible at this interaction. Thus long range interactions can be detected at the repressed GAL locus and these interactions show the same requirements as gene loops on active genes.

A Conditional Long Range Interaction at the GAL Locus.

Long range interactions for GAL10 and GAL7 are also observed in raffinose, a repressing growth medium, and in galactose when the genes are expressed (FIG. 7B). In both cases, the interaction is associated with RNAPII. The long range interaction between the 5′ region of GAL10 and the 3′ region of GAL7 (GAL10-7), however, depends on the carbon source in the growth medium (FIG. 7B, C). Most importantly, this long range interaction is lost in induced conditions (galactose) and even when glucose repression is removed (in raffinose). Thus, it is unlikely that disruption of the long range interaction can be explained simply by the high levels of sense transcription which are observed under inducing conditions. Rather, the conditionality of the GAL10-7 long range interaction appears to be related to loss of glucose repression.

Long Range Interactions at the GAL Locus are Influenced by Lysine Deacetylases.

Non-coding transcripts and long range interactions are present at the GAL locus in both repressing and inducing conditions. Whether the presence of non-coding transcripts is related to the long range interactions was investigated. The antisense transcripts are associated with the Hda1 lysine deacetylase (FIG. 5) (Camblong et al, 2007, Cell 131, 706-717). At the repressed and induced GAL10 locus, the level of chromatin associated Hda1 correlates with the level of antisense transcript. It was queried whether Hda1 and Eaf3 are required for the long range interactions at the GAL locus and at FMP27.

Loss of Hda1 has a dramatic effect on the long range interactions over GAL10, GAL7 and GAL10-7 in repressed cells (FIG. 6B). The effect on FMP27 is much less dramatic. By contrast, Eaf3 is required for the long range interaction at FMP27 and GAL10, but there is a lesser requirement for this factor at GAL7 or GAL10-7. Some loci, such as FMP27 and GAL7, show specificity for one or the other complex, while loci such as GAL10 appear to require both activities. These data implicate the lysine deacetylases Hda1 and Rpd3S, directly or indirectly, in the formation of long range interactions and suggest that this may be linked to antisense transcripts.

Discussion

The inventors have used the GAL locus as a model system in which to compare the induced and repressed states and observe differences in antisense transcript and epigenetic regulation. They show that gene repression is a proactive state involving the regulated production of antisense transcripts in association with epigenetic changes to the locus.

The non-coding transcript map at the GAL locus is complex and conditional. The presence of relatively short transcripts over the intergenic regions is reminiscent of promoter associated transcripts associated with yeast genes such as SER3 (Martens et al, 2004, Trends Genet 23, 126-133), IMD2, LEU4 (Davis and Ares, 2006, Proc Natl Acad Sci USA 103, 3262-3267) and Ty1 (Berretta et al, 2008, Genes Dev 22, 615-626) or mammalian genes such as DHFR (Martianov et al, 2007, Nature 445, 666-670). As these are present at both the induced and repressed GAL locus, they are unlikely to have regulatory functions related directly to activation and repression but may influence other aspects of locus topology.

The size and position of the non-coding antisense transcripts at GAL10 change with growth conditions. At the induced locus, there is one abundant antisense transcript whose levels rise and fall synergistically with the sense transcript. The data suggest that sequences at the 3′ flanking region of GAL10 are required to promote expression of this transcript. In addition to binding sites for the Gal4 regulator (this region also contains the promoter for GAL7) there are also Reb1 binding sites in this region. As this is a conditional transcript, a dual role for Gal4 in activating both the GAL7 transcript and the GAL10 antisense is possible. However, the inventors have also mapped a high level antisense transcript arising at the 3′ region of induced GAL7 and there are no Gal4 binding sites in this region. Given that loss of the GAL10 antisense transcript is associated with low levels of the GAL10 sense transcript, and the demonstration of long range interactions between the 3′ and 5′ regions of GAL10 and GAL7, an alternative possibility is that promoter sequences play a role in activating the antisense transcripts in trans.

The antisense transcripts at induced GAL10 share properties with two genes previously shown to be associated with antisense RNA. Like PHO5 (Uhler et al, 2007, Proc Natl Acad Sci USA 104, 8011-8016), the antisense transcript at induced GAL10 is linked to transcription of the sense transcript. The GAL10 transcript is different from that at PHO5 as it does not appear to extend into the promoter region and thus is unlikely to function in the way proposed for PHO5, by remodeling promoter chromatin. Like PHO84 (Camblong et al, 2007, Cell 131, 706-717), the presence of antisense transcript correlates with the association of the Hda1 lysine deacetylase (KDAC) with chromatin. Although lysine deacetylases are associated with both activation and repression of gene expression (Bernstein et al, 2000, Proc Natl Acad Sci USA 97, 13708-13713), at PHO84 Hda1 functions with the antisense transcript to repress the sense promoter (Camblong et al, 2007, Cell 131, 706-717). Paradoxically, high levels of H3K18ac are maintained over the active GAL10 gene, suggesting that acetylation associated with sense transcription shifts the dynamic balance towards acetylation. Underlying this is a ground state of active repression through KDACs and antisense transcripts. The inventors suggest that this ground state is represented in part by the long range interactions over GAL10 and GAL7 (FIG. 7E). In this model the long range interactions, although associated with RNAPII and TFIIB, would represent a poised state rather than active transcription as previously envisaged (Ansari and Hampsey, 2005, Genes Dev 19, 2969-2978; O'Sullivan et al, 2004, Nat Genet 36, 1014-1018; Singh and Hampsey, 2007, Mol Cell 27, 806-816). In yeast, flies and mammals many genes exist in a stable poised state with engaged RNAPII and the associated epigenetic modifications at the promoter (Guenther et al, 2007, Cell 130, 77-88; Radonjic et al, 2005, Mol Cell 18, 171-18; Zeitlinger et al, 2007, Nat Genet 39, 1512-1516).

At the induced GAL locus, antisense transcription extends over the same general region as the sense transcription. On repression there is a switch in the initiation site and a change in the nature of the sequences required to promote antisense transcription. Moreover, the antisense transcript becomes dominant. Long exposures of Northern blots, however, reveal very low levels of equivalently sized sense transcripts, suggesting that the relationship between sense and antisense transcription is maintained even on a repressed gene. This reinforces the idea of “active” repression and supports the repressed state being a variation of the events that occur on activation. One natural consequence of dominant antisense transcripts is active repression through KDACs such as Hda1 and Eaf3. It is interesting that at the repressed locus the antisense transcripts extend over the GAL10-1 intergenic region in a similar way to the exosome regulated antisense transcript at PHO84 (Camblong et al, 2007, Cell 131, 706-717). Deacetylation over this region and other intergenic regions may promote or stabilize long range interactions at the locus, for example the interaction over GAL10-7.

Also prominent on the map are long transcripts extending from one gene to another. These transcripts are conditional, for example the GAL1:FUR4 long transcript at the induced locus or the GAL10-1 antisense transcript at the repressed locus. Long non-coding transcripts are observed at the β-globin locus in mammalian cells and may be involved in conditional switches (Gribnau et al, 2000, Mol Cell 5, 377-386). In yeast, these transcripts may simply reflect poor transcript processing. Alternatively, they may be the first indication of two different types of transcription event (repressive or activating) driving or breaking higher orders of organization of yeast genes.

Experimental Procedures

Strains

Three different strain backgrounds were used in this study: BY4741 (MATa, his3Δ1, leu2Δ0, met15Δ0, ura3Δ0), W303-1a (MATa, ura3-52, leu2-3-112, his3-11, ade2-1, can1-100, trp1Δ2) and YMH14 (MATα, cyc1-5000, cyc7-67, ura3-52, leu2-3-112, cyh2). The sua7-1 allele is in the YMH14 background (Pinto et al, 1994, J Biol Chem 269, 30569-30573). Strains, including epitope tagged derivatives, truncations and gene deletions, were constructed by single step gene replacement using PCR-generated DNA fragments (Longtine et al, 1998, Yeast 14, 953-961). A pTEF:KanMX:TEFter cassette was inserted into the 3′ region of GAL10, resulting in loss of the residues between 2453 and 3007 with respect to the ATG. Transcription of the selectable marker is in the same direction as GAL10 sense transcription. SUA7 was tagged at the C terminus with the myc epitope in BY4741. hda1Δ and eaf3Δ deletions were constructed in the BY4741 background.

Media and Culture Conditions.

Growth media were prepared using standard methods in YE supplemented with 2% glucose, raffinose or galactose as required. Yeast were taken from fresh plates and grown to an OD600 of 0.6 to 0.8 in 50 to 100 ml. Yeast were harvested by centrifugation and washed in H₂O before transfer to fresh medium.

Chromatin Immunoprecipitation Protocol

Chromatin immunoprecipitation was performed as described (Meluh and Broach, 1999, Nature 445, 666-670; Morillon et al, 2005, Mol Cell 18, 723-734). In summary, ChIP was done using 50 ml cultures fixed with 1% formaldehyde for 15 minutes followed by addition of glycine at 0.25 mM final. Yeast cells were broken using glass beads on a MagnaLyser (Roche) and fixed chromatin was sheared by sonication using a Bioruptor (Diagenode). Average DNA fragment lengths were 150 to 300 bp. After centrifugation (30 min 10K, 4° C.), the soluble chromatin was incubated with antibody to the following epitopes: 5 μl of H3 (Abcam), 5 μl of H3K18ac (Upstate), 20 μl of Eaf3 (Abcam), 10 μl of Y80 (Santa Cruz) and 10 μl of myc (Sigma) in 1.5 ml siliconised Eppendorf tubes at 4° C. for 15 to 20 hours and immunoprecipitated with protein A sepharose for 90 minutes at room temperature. After washing, the chromatin was eluted from the beads at 65° C. for 30 minutes. Cross-links were reversed by incubation at 65° C. for 6 to 20 hours and samples were treated with protease and RNase A. DNA was purified using Qiagen PCR mini-columns and eluted in 100 μl water. IP samples and controls (e.g. no antibody, no tag) were used neat while control DNAs (input) were diluted accordingly. Samples were subject to real time PCR using a Corbett Rotorgene and Sybr Green mix (Sensymix, Quantace). Real time PCR was used to amplify regions corresponding to those shown at GAL10. Data were calculated as (IP-No antibody)/TOT and expressed as a percentage of input. Error bars reflect the standard deviation of the average signal obtained between different experiments (n=2 to 4).
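The calculation (IP-No antibody)/TOT, expressed as a percentage of input, can be sketched as below. This is a minimal illustration under stated assumptions rather than the analysis script used in the study: the function names and the `input_dilution` parameter are hypothetical, the inputs are assumed to be real time PCR signals already converted to relative DNA amounts, and the sample standard deviation is assumed for the error bars because the text does not say which form was used.

```python
# Illustrative sketch only; names, the input_dilution parameter and the
# example numbers are hypothetical and not taken from the study.
from statistics import mean, stdev


def percent_input(ip: float, no_antibody: float, total_input: float,
                  input_dilution: float = 1.0) -> float:
    """Return (IP - no antibody) / TOT as a percentage of input.

    `total_input` is the signal measured on the (possibly diluted) input
    sample; `input_dilution` scales it back to the undiluted amount.
    """
    if total_input <= 0 or input_dilution <= 0:
        raise ValueError("input signal and dilution factor must be positive")
    return 100.0 * (ip - no_antibody) / (total_input * input_dilution)


def summarize(values: list[float]) -> tuple[float, float]:
    """Return the mean and sample standard deviation across experiments."""
    spread = stdev(values) if len(values) > 1 else 0.0
    return mean(values), spread


if __name__ == "__main__":
    # Made-up signals from three hypothetical experiments: (IP, no antibody, input).
    experiments = [(5.0, 1.0, 200.0), (6.0, 1.0, 200.0), (7.0, 1.0, 200.0)]
    values = [percent_input(ip, bg, tot) for ip, bg, tot in experiments]
    print(summarize(values))
```

Dividing the percent input value for H3K18ac by the value obtained for H3 at the same region would give the kind of H3-normalized signal referred to in the discussion of FIG. 5C.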

Generation of Strand-Specific Probes

A T7 promoter was incorporated onto the end of a specific region of DNA using PCR. T7 RNA polymerase was used to generate single stranded probes with specificity for the sense or antisense strand of DNA using ³²P αUTP and the Ambion MAXIscript® Kit (Cat #AM1308-AM1326).

Northern Blotting

15 μg of total RNA, prepared from cells using hot phenol:chloroform and glass beads, was separated on 1.1% formaldehyde gels, transferred to Magna nylon membranes and baked at 80° C. for 2 hours, then hybridized overnight in PerfectHyb Plus (Sigma) at 64° C. and washed twice in 1×SSC/0.1% SDS and twice in 0.2×SSC/0.1% SDS for 20 minutes each wash. Membranes were typically exposed for 24 hours unless otherwise stated. Levels of total RNA loaded were monitored by the rRNA species, which are equal across samples unless indicated.

Reverse Transcription-PCR.

For each of the ten positions across GAL10-7 in FIG. 2, there is an RT primer for the sense transcript (with respect to the direction of GAL10 expression), an RT primer for the antisense transcript, two alternative primers for the first round PCR and two alternative nested primers for the amplification. Details of the primers are given in Table 1. First strand synthesis was done using the ABgene Verso cDNA kit with primers specific for the sense or antisense transcripts. A standard reaction was set up with 2 μl of RNA, 1 μl of primer (25 μM) and 9 μl of H₂O, heated to 70° C. for 5 min to remove any secondary structure and placed on ice immediately. Then 4 μl of buffer, 2 μl of dNTP, 1 μl of RT and 1 μl of RT enhancer were added and the reaction incubated at 52° C. for 50 min, then for 2 min at 95° C. In the RT control, reverse transcriptase was not added to the reactions. PCR amplification was a nested reaction using TakaRa DNA polymerase in the following reaction: 2× buffer I, 25 μl; dNTP (2.5 mM), 8 μl; cDNA, 2 μl; primers (25 μM), forward 1 μl, reverse 1 μl; TakaRa LA Taq polymerase, 0.5 μl; H₂O, 12.5 μl. Cycling was 94° C. for 5 min, 24 cycles of 94° C. for 1 min, 55° C. for 45 sec and 72° C. for 30 sec, then 72° C. for 5 min. For the second round, 2 μl of the first round product was used with the nested primers in a reaction as above: 94° C. for 5 min, then 18 cycles of 94° C. for 45 sec, 54° C. for 30 sec and 72° C. for 20 sec, and finally an incubation at 72° C. for 5 min.
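As a quick consistency check on the reaction set-up above, the component volumes and cycling programmes can be written down and summed. The sketch below is illustrative only: the dictionary and function names are hypothetical, and the programme time counts hold steps only, ignoring thermocycler ramping.

```python
# Component volumes in microlitres, taken from the protocol text.
RT_REACTION_UL = {
    "RNA": 2.0, "primer (25 uM)": 1.0, "H2O": 9.0,
    "buffer": 4.0, "dNTP": 2.0, "RT": 1.0, "RT enhancer": 1.0,
}
FIRST_ROUND_PCR_UL = {
    "2x buffer I": 25.0, "dNTP (2.5 mM)": 8.0, "cDNA": 2.0,
    "forward primer": 1.0, "reverse primer": 1.0,
    "LA Taq polymerase": 0.5, "H2O": 12.5,
}


def total_volume(mix):
    """Sum the component volumes of a reaction mix (microlitres)."""
    return sum(mix.values())


def hold_time_s(initial_s, cycles, step_times_s, final_s):
    """Total programmed hold time in seconds, excluding ramp times."""
    return initial_s + cycles * sum(step_times_s) + final_s


if __name__ == "__main__":
    print(total_volume(RT_REACTION_UL), "ul first strand reaction")
    print(total_volume(FIRST_ROUND_PCR_UL), "ul first round PCR")
    # First round: 94 C 5 min, 24 x (60 s, 45 s, 30 s), 72 C 5 min.
    print(hold_time_s(300, 24, (60, 45, 30), 300) / 60, "min first round")
    # Second round: 94 C 5 min, 18 x (45 s, 30 s, 20 s), 72 C 5 min.
    print(hold_time_s(300, 18, (45, 30, 20), 300) / 60, "min second round")
```

With the volumes as listed, the first strand reaction comes to 20 μl and the first round PCR to 50 μl.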

Capturing Chromosome Conformation with Immunoprecipitation (3CIP)

Nuclei were extracted from 100 ml Saccharomyces cerevisiae culture grown in appropriate medium to optical density A600=0.2. Formaldehyde was added to 1% (2.44 ml of 41%) and the culture shaken for 10 minutes. The formaldehyde was quenched by adding glycine to 0.125 M (5 ml of 2.5 M). The cell pellet was washed twice in Mg/K buffer (0.1 M K2HPO4/KH2PO4 (35:65 ratio), 5 mM MgCl2, pH 6.5) and resuspended in spheroplasting buffer (1.2 M sorbitol, 500 U yeast lytic enzyme and 25 mM DTT in Mg/K buffer) for 15 minutes at room temperature. Spheroplasts were washed once in MES buffer (0.1 M MES, 1.2 M sorbitol, 1 mM EDTA, 0.5 mM MgCl2 adjusted to pH 6.4 using NaOH) at 4° C. and resuspended in 10 ml MES lysis buffer (0.1 M MES, 1 mM EDTA, 0.5 mM MgCl2, pH 6.4). The spheroplasts were lysed using 10 strokes with a hand held homogeniser. The lysate was layered onto a sucrose gradient (5 ml 1.8 M sucrose, 10 ml 1.1 M sucrose in MES lysis buffer in a Corex tube) and separated by centrifugation for 10 min at 10,000 RPM in a Beckman JA-17 rotor. The nuclei pellet was located at the interface on the glass wall. The pellet at the bottom of the tube was removed using a water wash and discarded. The nuclei pellet was washed off the glass with CSK buffer (100 mM NaCl, 300 mM sucrose, 10 mM PIPES, 3 mM MgCl2, 1 mM EGTA, 0.5% Triton X-100, 10 M leupeptin, 1:1000 AEBSF) at 4° C., washed again, resuspended in 1 ml of CSK buffer and left for 20 minutes on ice. The nuclei were pelleted and all but ~100 μl of the supernatant removed. 40 μl of 5 M NaCl was added and the mixture incubated for 10 min on ice. The viscous mixture was diluted with 1.2 ml H₂O. Antibody was added and the mixture rotated at 4° C. overnight.

~40 μl of protein G-sepharose slurry (20-30 μl of beads) was prepared by washing twice in H₂O and once with 1 ml of restriction wash buffer (50 mM Tris-HCl (pH=8.1), 100 mM NaCl, 10 mM MgCl₂) and centrifugation at 2000 rpm for 3 min to collect the beads. The chromatin mixture was incubated with the beads with rotation for 60 minutes and the beads collected by centrifugation at 1000 rpm for 3 min, then washed 3 times with 1 ml restriction wash buffer by rotating at 4° C. for 5 min and spinning at 2000 rpm for 3 min. 10 μl of 10× buffer 3, 50 U restriction enzyme and water to 100 μl were added and the chromatin digested overnight at 37° C. for DpnII, or 25° C. for CviQI, then heated at 65° C. for 10 min to inactivate the restriction enzyme. Heat insensitive enzymes such as CviQI were removed by washing the beads twice with restriction wash buffer. The mixture was diluted and ligated with 410 μl of H₂O, 60 μl of 10× ligation buffer and 30 μl (12000 U) T4 ligase and incubated at 16° C. for 4 hrs. The mixture was incubated overnight at 65° C. to de-crosslink. 1 μl of 1 mg/ml RNAase A was added and incubated at 37° C. for 30 min, followed by 60 μl of 20 mg/ml Proteinase K and incubation at 42° C. for one hour. The DNA was extracted using 660 μl phenol:chloroform:isoamyl alcohol and precipitated with 30 μl of 5 M NaCl, 0.5 μl 10 mg/ml glycogen and 1 ml cold ethanol at −80° C. for one hour. The pellet was collected, washed and resuspended in 20 μl H₂O.

The following controls were included in the protocol as recommended (Dekker, 2006, Nat Methods 3, 17-21): The immunoprecipitation step was excluded to do a straight 3C procedure. For both the 3C and the 3C with IP, RNAase or Proteinase K treatments were included before the ligation step to demonstrate dependence on RNA or protein. The protocol was conducted on nuclei isolated without the formaldehyde treatment step. The immunoprecipitation steps were done after the restriction and ligation steps using a standard ChIP protocol (see above).

The products of the reaction were detected by nested PCR using TakaRa polymerase in a 50 μl reaction. Primer stocks were 25 μM. The first reaction contained 25 μl of GC buffer I, 8 μl dNTP solution (1.25 mM each), 1 μl template, 1 μl of each primer, 13.5 μl H₂O and 0.5 μl TakaRa DNA polymerase (5 U/μl) for 25 cycles (94° C. 5 min; [94° C. 45 s, 60° C. 30 s, 72° C. 20 s]; 72° C. 5 min). The second reaction contained 25 μl of GC buffer I, 8 μl dNTP solution (1.25 mM each), 2 μl template from the first reaction, 1 μl of each primer, 12.5 μl H₂O and 0.5 μl TakaRa DNA polymerase (5 U/μl) for 18 cycles (94° C. 5 min; [94° C. 45 s, 61° C. 30 s, 72° C. 20 s]; 72° C. 5 min). As there are four possible products for each long range interaction, all combinations of primers, except forward:forward, were included in the initial analysis. Only data with one primer combination is shown. The primer orientation (forward F; reverse R) is given relative to the direction of the ORF. Each set is nested with an inner (i) and outer (o) primer (see Table 2). PCR reactions were controlled by omitting template or DNA polymerase. Templates to control for primer efficiency were prepared by ligating DpnII restricted genomic DNA. The control templates and experimental templates were titrated to determine the linear range of amplification; only one equivalent product within this range is shown for each sample.
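The fixation and quench volumes quoted at the start of the 3CIP protocol (2.44 ml of 41% formaldehyde and 5 ml of 2.5 M glycine added to a 100 ml culture) can be checked with a simple dilution calculation. This is a sketch under stated assumptions: the function name is hypothetical, volumes are treated as additive, and the culture volume is taken as exactly 100 ml.

```python
def final_concentration(stock_conc, stock_vol_ml, start_vol_ml):
    """Concentration after adding a stock to a starting volume.

    Applies C_final = C_stock * V_stock / (V_start + V_stock), assuming
    the volumes are additive. The result is in the units of stock_conc.
    """
    return stock_conc * stock_vol_ml / (start_vol_ml + stock_vol_ml)


if __name__ == "__main__":
    culture_ml = 100.0

    # 2.44 ml of 41% formaldehyde into the culture.
    formaldehyde_pct = final_concentration(41.0, 2.44, culture_ml)
    print(f"formaldehyde: {formaldehyde_pct:.2f}%")

    # 5 ml of 2.5 M glycine into the culture plus formaldehyde.
    glycine_m = final_concentration(2.5, 5.0, culture_ml + 2.44)
    print(f"glycine: {glycine_m:.3f} M")
```

Allowing for the added volumes gives roughly 0.98% formaldehyde and roughly 0.116 M glycine; the nominal 1% and 0.125 M in the text correspond to the same additions calculated against the 100 ml culture volume alone.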

TABLE 1 Primers for RT-PCR analysis

RT sense gene specific primers
No.     Primer      Sequence (5′ to 3′)
1       S1KAP104    CTGTAAAAGAGTTGC
2       INF1GAL7    CTGCAACATCCAAT
3       INF2GAL7    AAGGACCACTCTTAC
4       S1GAL7      CATGTGAAACCAAC
5       INFIGAL10   CTACTTTACCAAACG
6       S1GAL10     CAAGGTTACACAATC
7       S2GAL10     CTTCACCAGCAGTC
8       S3GAL10     GCAAGATAGCAAAC
9       S4GAL10     TTAGCTCTACCACAG
10      INF1GAL1    TGGTTATGAAGAGG

RT antisense gene specific primers
No.     Primer      Sequence (5′ to 3′)
1       INR1GAL7    CAAGGCTCATTGTC
2       INR2GAL7    ACGGAGTGACAATA
3       AS2GAL7     CTTGGTTGGTTTTG
4       AS1GAL7     TGGTGCTTAGAATC
5       INR1GAL10   CTACAGATTTTCCTG
6       AS1GAL10    AGGTGATCTATTGGT
7       AS2GAL10    GCAAGATTTGTGAC
8       AS3GAL10    GTTTTGGTTACAGG
9       INR1GAL1    GTGAAGACGAGGAC
10      S1GAL1      CAATCACTTCTTCTG

First round PCR primers
No.     Primer      Sequence (5′ to 3′)
1Fwd.   FKAP1041    ATAGTCTTCGGCGGGCTTC
1Rev.   RKAP1041    CGATGGAAATCCTGCACCTA
2Fwd.   FGAL71      CGTAATAAACTTCAACAGAGCCTAAA
2Rev.   RGAL71      TTCTAGTTATGTAAGAGTGGTCCTTTC
3Fwd.   FGAL73      TTGTCACTCCGTTCAAGTCG
3Rev.   RGAL73      GCCTCAAAGAGATTTAACTTCG
4Fwd.   FGAL75      ACCAGTCGCATTCAAAGGAG
4Rev.   RGAL75      TGAAGTTTCGCAAGAATTGAAA
5Fwd.   FGAL101     GCGCTTCGCAATAGTTGT
5Rev.   RGAL101     TTGCCAGCTTACTATCCTTCTTG
6Fwd.   FGAL103     CATCAATGTATCTACCAGGCTCA
6Rev.   RGAL103     AAATTGACTGCTGGTGAAGC
7Fwd.   FG10AS2     ATTTTGAATGATGGGTCCC
7Rev.   RG10AS2     AGATTTCAAGCCACGTTTGC
8Fwd.   FGAL105     TGGCGTATTTCGTATGACCA
8Rev.   RGAL105     TGTTGCTGATAACCTGTCGAA
9Fwd.   FGAL107     TGGATGGACGCAAAGAAGTT
9Rev.   RGAL107     GCTCGGCGGCTTCTAATC
10Fwd.  FGAL11      CGAATCAAATTAACAACCATAGGA
10Rev.  RGAL11      AATACAAACTGAAAATGTTGAAAGT

Nested PCR primers
No.     Primer      Sequence (5′ to 3′)
1Fwd.   FKAP1042    AAAACCAAAGACTGCGGAAT
1Rev.   RKAP1042    TCCGGGTTATAGAGTTTTGCTT
2Fwd.   FGAL72      TGTCAATAAAGTGGAAATGTGTCA
2Rev.   RGAL72      GAATTTTAGGAATACAATGCAGCTT
3Fwd.   FGAL74      GAAACCAGGCAGTTAATAGAAAAA
3Rev.   RGAL74      GCTGCTGAAAAACTAAGAAA
4Fwd.   FGAL76      TTAAAATCGAGGCGAGGTC
4Rev.   RGAL76      TGATTTGTTTGCCGATTACG
5Fwd.   FGAL102     GCGGCTCGTGCTATATTCTT
5Rev.   RGAL102     TTGCTGTATAACGAATTTTATGC
6Fwd.   FGAL104     CCAGCAGACAAGAAATCACC
6Rev.   RGAL104     TTGAGGGTACGGAGATTATGG
7Fwd.   FG10AS1     ATTAACGCCGTTATTAACG
7Rev.   RG10AS1     TTCTTGGCTATGAAAATGAGG
8Fwd.   FGAL106     GGGAATCTCGTAGCATCACC
8Rev.   RGAL106     TGTGTGACCGAAAAGGTCTG
9Fwd.   FGAL108     TGTTGTGGAAATGTAAAGAGC
9Rev.   RGAL108     GCAATGAGCAGTTAAGCGTATT
10Fwd.  FGAL12      TTTTTAGCCTTATTTCTGGGGTAA
10Rev.  RGAL12      AAGTGGTTATGCAGCTTTTCC

TABLE 2 Primers for 3C analysis

No.  Primer designation         Sequence (5′-3′)
1    GAL7 terminator R(o)       GCTCATTGTCGGTGTCGTTA
1    GAL7 terminator R(i)       CGATGGAAATCCTGCACCTA
2    GAL7 terminator F(o)       TTGTCACTCCGTTCAAGTCG
2    GAL7 terminator F(i)       TCCGAAGTTAAATCTCTTTGAGG
3    GAL10-7 intergenic R(o)    TTGCTTTGCCTCTCCTTTTG
3    GAL10-7 intergenic R(i)    CGTTTGGTAAAGTAGAGGGGGTA
4    GAL10-7 intergenic F(o)    CGCACCATAATCTCCGTACC
4    GAL10-7 intergenic F(i)    CGCTTCACCAGCAGTCAAT
5    GAL10 promoter R(o)        GGGCCTACTAATCCGTATGGT
5    GAL10 promoter R(i)        TCCCAGAAGAATGTCCCTTAG
6    GAL10 promoter F(o)        GAGGAAAAATTGGCAGTAACCT
6    GAL10 promoter F(i)        GCCCCACAAACCTTCAAAT
7    FMP27 promoter F(o)        ATCAAAGCCACGCCAAAC
7    FMP27 promoter F(i)        CCTACACGCAAAGGAACTAGAGA
8    FMP27 terminator F(o)      AGCAAACCGAACATCAAACC

DpnII: Primer pairs 4+6, 5+3 and 3+6 will detect long range interactions across GAL10. Primer pairs 2+4, 3+1 and 1+4 will detect interactions across GAL7. Primer pairs 6+2, 5+1 and 1+6 will detect interactions across GAL10-7. For the analyses shown in FIG. 5 only the last of the primer combinations for each interaction is shown.
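The primer pairings listed above can be held in a small lookup so that the combination reported for each interaction (the last one listed) is selected consistently. This is an illustrative sketch: the variable and function names are hypothetical, and the site labels are the Table 2 designations for primers 1 to 6 without the inner/outer suffix.

```python
# Primer pairs per interaction, in the order given in the text.
PRIMER_PAIRS = {
    "GAL10": [(4, 6), (5, 3), (3, 6)],
    "GAL7": [(2, 4), (3, 1), (1, 4)],
    "GAL10-7": [(6, 2), (5, 1), (1, 6)],
}

# Table 2 designations for primer numbers 1 to 6.
PRIMER_SITES = {
    1: "GAL7 terminator R",
    2: "GAL7 terminator F",
    3: "GAL10-7 intergenic R",
    4: "GAL10-7 intergenic F",
    5: "GAL10 promoter R",
    6: "GAL10 promoter F",
}


def shown_pair(interaction):
    """Return the last listed primer pair, the one shown in FIG. 5."""
    return PRIMER_PAIRS[interaction][-1]


def describe(interaction):
    """Spell out the shown pair using the Table 2 designations."""
    first, second = shown_pair(interaction)
    return f"{PRIMER_SITES[first]} + {PRIMER_SITES[second]}"


if __name__ == "__main__":
    for name in PRIMER_PAIRS:
        print(name, shown_pair(name), describe(name))
```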

Table 3 shows potential chromosomal positions across the Gal locus where long range interactions may occur. For each region of the chromosome, a set of forward and reverse primers is designed. Long range interaction at the gal locus is monitored by 3C analysis between the primers designed for each region of the chromosome. For example, to monitor interaction between Gal 7 and Gal 10 regions, the primers of Row 3 (274081-87) and Row 5 (278016-19) will be used. If interactions at other regions are to be monitored, other combinations of primers will be used.

TABLE 3

Organism  Chromosome Number  Primer start position  Primer stop position  Strand
Yeast     II                 273036                 273043                +
Yeast     II                 273126                 273130                +
Yeast     II                 274081                 274087                +
Yeast     II                 274838                 274852                +
Yeast     II                 278017                 278019                +
Yeast     II                 278022                 278022                +
Yeast     II                 278025                 278026                +
Yeast     II                 279321                 279331                +
Yeast     II                 279952                 279962                +
Yeast     II                 281268                 281284                +
Yeast     II                 279941                 279959                −
Yeast     II                 279595                 279595                −
Yeast     II                 274080                 274084                −
Yeast     II                 273684                 273694                −

This system is equally applicable to any other chromosomal locus where long range chromosomal interactions are thought to occur. Once the region is identified, primers can be designed to identify the presence or absence of a specified long range interaction indicating a particular physiological condition.
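Selecting primer regions from Table 3 for a given pair of loci can be sketched as a row lookup. The example below is a hypothetical illustration: the names `PrimerRegion`, `row` and `separation_bp` are not from the source, and the coordinates are copied from Table 3 as listed (the text cites Row 5 as 278016-19, whereas the table gives 278017 to 278019; the table values are used here).

```python
from typing import NamedTuple


class PrimerRegion(NamedTuple):
    chromosome: str
    start: int
    stop: int
    strand: str


# Rows of Table 3 (yeast chromosome II), in table order.
TABLE_3 = [
    PrimerRegion("II", 273036, 273043, "+"),
    PrimerRegion("II", 273126, 273130, "+"),
    PrimerRegion("II", 274081, 274087, "+"),
    PrimerRegion("II", 274838, 274852, "+"),
    PrimerRegion("II", 278017, 278019, "+"),
    PrimerRegion("II", 278022, 278022, "+"),
    PrimerRegion("II", 278025, 278026, "+"),
    PrimerRegion("II", 279321, 279331, "+"),
    PrimerRegion("II", 279952, 279962, "+"),
    PrimerRegion("II", 281268, 281284, "+"),
    PrimerRegion("II", 279941, 279959, "-"),
    PrimerRegion("II", 279595, 279595, "-"),
    PrimerRegion("II", 274080, 274084, "-"),
    PrimerRegion("II", 273684, 273694, "-"),
]


def row(number):
    """Return a Table 3 row using the 1-based numbering of the text."""
    return TABLE_3[number - 1]


def separation_bp(a, b):
    """Linear distance in bp between the start positions of two regions."""
    if a.chromosome != b.chromosome:
        raise ValueError("regions are on different chromosomes")
    return abs(b.start - a.start)


if __name__ == "__main__":
    gal7, gal10 = row(3), row(5)
    print(gal7, gal10, separation_bp(gal7, gal10))
```

The same lookup extends to other loci by replacing the table rows with the primer regions designed for the locus of interest.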

EXAMPLE II Control of Long Range Interactions by CTCF

Several transcription factors have been shown to play a role both in long-range DNA interactions and in transcription and therefore may be good candidates to provide a link between these processes. Examples include the basic transcription factor TFIIB, which was shown to organize looping of several genes in yeast (Mol Cell, 27, 806-16), and the transcription factors EKLF, GATA-1 and FOG-1, which are responsible for long-range DNA interactions in the β-globin gene (Drissen et al, 2004, Genes Dev, 18, 2485-90; Vakoc, et al, 2005, Mol Cell, 17, 453-62).

The inventors have identified CCCTC-binding protein (CTCF) as another candidate to perform these functions genome-wide. CTCF is implicated both in transcriptional regulation and in the formation of high-order conformational intra- and inter-chromosomal structures (Klenova, 2002, Semin Cancer Biol, 12, 399-414; Kurukuti et al, 2006, Proc Natl Acad Sci USA, 103, 10684-9; Zhao et al, 2006, Nat Genet, 38, 1341-7; Splinter et al, 2006, Genes Dev, 20, 2349-54). It is estimated that there are >15,000 CTCF-binding sites in the genome (Kim et al, 2007, Cell, 128, 1231-45); however, it is likely that the real number of such sites is >30,000 (Vetchinova et al, 2006, Anal Biochem, 354, 85-93). In transcription, CTCF can act as a classical transcription factor; a recent report demonstrates that CTCF may control transcription directly through its interaction with RNA Polymerase II (Pol II) (Chernukhin et al, 2007, Mol Cell Biol, 27, 1631-48). CTCF-Pol II co-localization at the transcription start sites (TSS) of active genes genome-wide further strengthens this possibility (Birney et al, 2007, Nature, 447, 799-816). Finally, CTCF can form dimers which may be important for organization of DNA loops (Pant et al, 2004, Mol Cell Biol, 24, 3497-504).

Unique properties of CTCF have prompted the inventors to investigate whether it can mechanistically link the formation of high order chromosomal structures and transcription, possibly via its association with Pol II. To investigate this, minimal in vivo transcription cell systems based on two genetically modified NIH3T3 cell lines have been used. These lines carry stably integrated expression vectors containing the CTCF binding site or its mutated variant deficient for CTCF binding, fused to the promoter-less Luciferase reporter gene (pN-MycLuc wt and pN-MycLuc mut, FIGS. 8 a, b).

The wild type single site, but not its mutant variant, was sufficient to drive expression from the reporter Luciferase gene (FIG. 8 c) (Chernukhin et al, 2007, Mol Cell Biol, 27, 1631-48). Both CTCF and Pol II were present at the wild type, but not the mutant, CTCF binding site (FIG. 8 d), thus supporting the earlier model that CTCF helps to recruit Pol II in the absence of the endogenous promoter.

In this system transcription processes may be linked with the formation of high order DNA structures, in particular between the 5′ and 3′ regions of the integrated DNA pN-MycLuc wt. High-order conformational structures can be monitored by the Chromosomal Conformation Capture (3C) assay, which detects close proximity of distant sites on the chromosomal DNA in vivo. The inventors have applied the 3C analysis to the integrated pN-MycLuc wt and pN-MycLuc mut loci. Two sites at the 5′ position and 3′ position (FIG. 8 b) were identified as juxtaposed in the 3C assay in pN-MycLuc wt, but not in pN-MycLuc mut (FIG. 9 a).

On the basis of the earlier work on the CTCF interaction and co-localization with Pol II, the inventors hypothesised that CTCF and Pol II may be linked to the formation of high-order structures on the transcribed pN-MycLuc wt gene. To investigate this, work has been undertaken to identify whether both factors are present at the newly identified juxtaposed sites (FIG. 9 a). Chromatin immunoprecipitation (ChIP) assay revealed that indeed both CTCF and Pol II are present at the 5′ and 3′ sites, identified as juxtaposed, in pN-MycLuc wt, but not in pN-MycLuc mut (FIG. 9 b, c).

N-Myc is a known CTCF target site and the characteristic features of the sequences within N-Myc involved in CTCF binding were previously investigated (Chernukhin et al, 2007, Mol Cell Biol, 27, 1631-48; Lutz et al, 2003, Embo J, 22, 1579-87). The high frequency of occurrence of CTCF binding sites in genomes led the inventors to hypothesize that there may be another potential CTCF target site at the 3′ end of the Luciferase gene in the integrated pN-MycLuc wt (FIG. 9 c). As it is difficult to predict CTCF binding from the sequence, electrophoretic mobility shift assay (EMSA) and footprinting analysis were used to investigate whether CTCF can directly bind to the identified sequences. Indeed, the binding was demonstrated by EMSA with the labelled probe containing the 3′ sequences and recombinant CTCF. Footprint analysis of this sequence revealed the region protected from DNase I digestion by recombinant CTCF. Thus, taken together, the ChIP, EMSA and footprinting data confirm that CTCF can indeed bind to the 3′ site. Therefore it is conceivable that two CTCF molecules bound to the two sites can be involved in the formation of the juxtaposition.

To further confirm the involvement of CTCF and Pol II in the establishment of high-order structures, the 4C assay (ChIP assays with either anti-Pol II or anti-CTCF antibodies followed by the 3C) was performed. The Pol II 4C and CTCF 4C analyses demonstrated the presence of Pol II and CTCF at the 5′ and 3′ sites during juxtaposition in pN-MycLuc wt, but not in pN-MycLuc mut (FIG. 10 a, b). From these experiments it was concluded that the establishment of the juxtapositions between the 5′ and the 3′ regions of the wild type active construct, pN-MycLuc wt, was associated with the presence of CTCF and Pol II at the identified sites and also with the wild type status of the construct.

Further analysis of the dependency of the observed phenomenon on transcription was undertaken. For this purpose cells were treated with the inhibitor of transcription, alpha-amanitin. The treatment abolished the activity of pN-MycLuc wt and pN-MycLuc mut (FIG. 8 c) and led to the disappearance of Pol II, but not CTCF, from the N-myc site (FIG. 8 e). The Pol II 4C assay performed with the anti-Pol II antibody on treated cells did not reveal the presence of Pol II at the juxtaposition between the 5′ and 3′ sites. However, this juxtaposition was detected with the anti-CTCF antibody (FIG. 10 a, b). The 3C analysis of cells treated with alpha-amanitin confirmed the existence of the structure (FIG. 9 a).

The inventors also tested whether the recombinant CTCF mixed in vitro with the linearised naked plasmid DNA, pN-MycLuc wt, could be sufficient to form the juxtapositions detected in vivo. Using a 4C assay (ChIP assay in combination with 3C), structures similar to the structures formed in vivo were detected in this basic system. Significantly weaker signals were observed with the pN-MycLuc mut construct used as a control, thus indicating that the detected structures were dependent on two intact CTCF sites. The presence of CTCF binding in pN-MycLuc wt was confirmed by ChIP assay for both 5′ and 3′ sites.

Taken together, the data suggest that transcriptional processes require the formation of high order structures; however, high order structures, in the reported case dependent on CTCF, exist without ongoing transcription. These findings support observations that long-range DNA interactions in the β-globin gene are maintained after inhibition of transcription (Palstra et al, 2008, PLoS ONE 3, e1661).

The inventors propose a model in which the establishment of the high order structure between the 5′ and 3′ ends of the pN-MycLuc wt is CTCF-dependent (FIG. 11). In this model, interaction between two CTCF molecules positioned at two distant sites leads to formation of a CTCF dimer and the establishment of the DNA loop (Klenova et al, 2005, Cell Cycle, 4, 96-101). Transcriptional processes can be initiated after the establishment of this configuration. Following the inhibition of transcription and removal of Pol II, the juxtaposed structure can still be detected, most likely due to its association with CTCF (FIG. 11).

This example makes use of a very simple transcription system. In this system, transcription from the promoter-less Luciferase construct was driven by CTCF interacting with Pol II through the CTCF binding site, N-Myc (Chernukhin et al, 2007, Mol Cell Biol, 27, 1631-48). It was discovered that the transcription process relied on the juxtaposition between the 5′ N-Myc and the 3′ end of the Luciferase gene; the second CTCF binding site was identified within the juxtaposed 3′ end. It is concluded that numerous transient interactions take place continuously between CTCF molecules bound to DNA in cis and trans. Stabilisation of such quasi-stable high order chromosomal associations may be a regulated process; poly ADP-ribosylation of CTCF may be involved in such regulation (Klenova et al, 2005, Cell Cycle, 4, 96-101; Yu et al, 2004, Nat Genet, 36, 1105-10). One of the outcomes of the formation of high order structures may be initiation of a transcriptional process.

Using a minimal transcription system it was identified that CTCF is involved in the establishment and maintenance of the high-order chromatin structures, which in turn are required for ongoing transcription by RNA Polymerase II.

All references cited herein are incorporated in their entirety.

1. A method of monitoring epigenetic changes comprising monitoring changes in conditional long range chromosomal interactions at least one chromosomal locus where the spectrum of long range interaction is associated with a specific physiological condition, said method comprising the steps of: (i) in vitro crosslinking of said long range chromosomal interactions present at the at least one chromosomal locus; (ii) isolating the cross linked DNA from said chromosomal locus; (iii) subjecting said cross linked DNA to restriction digestion with an enzyme that cuts at least once within the at least one chromosomal locus; (iv) ligating said cross linked cleaved DNA ends to form DNA loops; (v) identifying the presence of said DNA loops; wherein the presence of DNA loops indicates the presence of a specific long range chromosomal interaction.
2. The method according to claim 1, wherein the presence of the DNA loops are identified using PCR techniques.
3. The method according to claim 1, wherein the presence of a DNA loop indicates an altered transcription state indicative of a specific physiological condition.
4. The method according to any one of claim 1, wherein said physiological condition is selected from amongst cancer, cardiovascular disorders, inflammatory conditions, including autoimmune disorders and inflammatory responses to infectious diseases, and inherited genetic disorders modulated by epigenetic mechanisms.
5. A method of monitoring epigenetic changes comprising monitoring changes in conditional long range chromosomal interactions at least one chromosomal locus where the spectrum of long range interaction is associated with a specific physiological condition, said method comprising the step of identifying a change in the antisense RNA profile expressed from the at least one chromosomal locus.
6. The method according to claim 5, wherein the change in the antisense RNA profile is a change in the size, start position, and/or number of antisense RNA transcripts.
7. A method of diagnosing a disorder associated with at least one epigenetic change in a subject, said method comprising identifying in a sample previously isolated from the subject a change in one or more long range chromosomal interactions at least one chromosomal locus associated with said disorder; wherein said method comprises the method of any one of claim 1.
8. The method according to claim 7, wherein the epigenetic change results in altered transcription from the chromosomal locus.
9. The method according to claim 7, wherein the change in the altered transcription is up regulation, repression, or production of an alternative transcript.
10. The method according to any one of claim 7, wherein the epigenetic change causes a change in the expression of at least one gene.
11. A method of regulating transcription of at least one gene in a patient suffering from a disorder associated with altered gene expression, said method comprising administering to said patient an antisense RNA in an amount effective to alter transcription of said at least one gene.
12. The method according to claim 11, wherein said disorder results from over expression of said gene.
13. The method according to claim 11, wherein said disorder results from repression of said gene.
14. The method according to claim 11, wherein said disorder results from production of an altered gene product.
15. The method according to any one of claim 11, wherein said antisense RNA targets at least one CTCF binding site.
16. The method according to any one of claim 11, wherein administration of said antisense RNA results in modulation of HDAC enzymes.
17. Antisense RNA for use in the treatment of a disorder associated with altered gene expression, wherein said antisense RNA regulates transcription of said gene.
18. The antisense RNA according to claim 17, wherein said RNA represses transcription of said gene.
19. The antisense RNA according to claim 17, wherein said RNA induces transcription of said gene.
20. A method of identifying the transcription status of a chromosomal locus comprising the steps of: identifying the antisense RNA transcript profile expressed from said chromosomal locus; and comparing said profile with the antisense RNA transcript profile of said chromosomal locus in a known state.
21. The method of claim 20, wherein said method is performed in vitro.